side-channel attack
Whisper Leak: a side-channel attack on Large Language Models
McDonald, Geoff, Bar Or, Jonathan
Large Language Models (LLMs) are increasingly deployed in sensitive domains including healthcare, legal services, and confidential communications, where privacy is paramount. This paper introduces Whisper Leak, a side-channel attack that infers user prompt topics from encrypted LLM traffic by analyzing packet size and timing patterns in streaming responses. Despite TLS encryption protecting content, these metadata patterns leak sufficient information to enable topic classification. We demonstrate the attack across 28 popular LLMs from major providers, achieving near-perfect classification (often >98% AUPRC) and high precision even at extreme class imbalance (10,000:1 noise-to-target ratio). For many models, we achieve 100% precision in identifying sensitive topics like "money laundering" while recovering 5-20% of target conversations. This industry-wide vulnerability poses significant risks for users under network surveillance by ISPs, governments, or local adversaries. We evaluate three mitigation strategies - random padding, token batching, and packet injection - finding that while each reduces attack effectiveness, none provides complete protection. Through responsible disclosure, we have collaborated with providers to implement initial countermeasures. Our findings underscore the need for LLM providers to address metadata leakage as AI systems handle increasingly sensitive information.
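The core idea is that packet sizes and inter-arrival times of a streamed, TLS-encrypted response still correlate with the prompt's topic. A minimal sketch of that intuition (not the paper's classifier; the feature set, traces, and nearest-centroid scheme are illustrative assumptions):

```python
# Toy metadata-only topic classifier: TLS hides packet contents,
# but sizes and timing gaps remain observable on the wire.
from statistics import mean

def features(trace):
    """Reduce a trace of (packet_size_bytes, inter_arrival_s) pairs
    to a small metadata feature vector."""
    sizes = [s for s, _ in trace]
    gaps = [g for _, g in trace]
    return (mean(sizes), max(sizes), mean(gaps), len(trace))

def nearest_topic(trace, labeled_traces):
    """Nearest-centroid classification over metadata features."""
    f = features(trace)
    centroids = {}
    for topic, traces in labeled_traces.items():
        feats = [features(t) for t in traces]
        centroids[topic] = tuple(mean(col) for col in zip(*feats))
    def dist(g):
        return sum((a - b) ** 2 for a, b in zip(f, g))
    return min(centroids, key=lambda t: dist(centroids[t]))
```

In the paper's setting, a passive observer trains such a classifier on its own traffic to the same provider, then applies it to a victim's encrypted stream.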
Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference
Chu, Kexin, Lin, Zecheng, Xiang, Dawei, Shen, Zixu, Su, Jianchang, Chu, Cheng, Yang, Yiwei, Zhang, Wenhui, Wu, Wenfei, Zhang, Wei
Global KV-cache sharing has emerged as a key optimization for accelerating large language model (LLM) inference. However, it exposes a new class of timing side-channel attacks, enabling adversaries to infer sensitive user inputs via shared cache entries. Existing defenses, such as per-user isolation, eliminate leakage but degrade performance by up to 38.9% in time-to-first-token (TTFT), making them impractical for high-throughput deployment. To address this gap, we introduce SafeKV (Secure and Flexible KV Cache Sharing), a privacy-aware KV-cache management framework that selectively shares non-sensitive entries while confining sensitive content to private caches. SafeKV comprises three components: (i) a hybrid, multi-tier detection pipeline that integrates rule-based pattern matching, a general-purpose privacy detector, and context-aware validation; (ii) a unified radix-tree index that manages public and private entries across heterogeneous memory tiers (HBM, DRAM, SSD); and (iii) entropy-based access monitoring to detect and mitigate residual information leakage. Our evaluation shows that SafeKV mitigates 94%-97% of timing-based side-channel attacks. Compared to the per-user isolation method, SafeKV improves TTFT by up to 40.58% and throughput by up to 2.66x across diverse LLMs and workloads. By combining fine-grained privacy control with high cache reuse efficiency, SafeKV reclaims the performance advantages of global sharing while providing robust runtime privacy guarantees for LLM inference.
Large language models (LLMs) now underpin applications from dialogue to complex reasoning. To meet time-sensitive inference demands, key-value (KV) caching stores intermediate attention states ("keys" and "values") to eliminate redundant computation for sequential or similar prompts, thereby accelerating generation [70]. This efficiency gain is amplified through KV-cache sharing across multiple requests.
In particular, prompts with common prefixes, such as shared dialogue history or structured prompting patterns, enable substantial throughput improvements and latency reduction. Consequently, KV-cache sharing has become a critical mechanism for boosting throughput and reducing response latency in large-scale, multi-user LLM deployments. Empirical studies confirm that a substantial portion of real-world prompts exhibit prefix-level or structural overlap [42], [74], making shared KV reuse both practical and highly beneficial. Despite these performance benefits, KV-cache sharing raises serious privacy and security concerns in shared or multi-tenant deployments. Specifically, KV-cache sharing across mutually untrusted users can lead to unintended information leakage.
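SafeKV's central policy can be sketched as a cache that shares public prefixes globally but confines entries flagged sensitive to their owner. This is a minimal illustration of that selective-sharing rule, not the paper's radix-tree implementation (class and method names are assumed):

```python
# Privacy-aware prefix reuse: public cached prefixes are shared across
# users; sensitive ones never cross the owner's boundary.
class KVCache:
    def __init__(self):
        self.entries = []  # (prefix_tokens, owner, sensitive)

    def insert(self, tokens, owner, sensitive):
        self.entries.append((tuple(tokens), owner, sensitive))

    def longest_reusable_prefix(self, tokens, user):
        """Length of the longest cached prefix this user may reuse."""
        best = 0
        for prefix, owner, sensitive in self.entries:
            if sensitive and owner != user:
                continue  # private entries are confined to their owner
            n = 0
            while n < min(len(prefix), len(tokens)) and prefix[n] == tokens[n]:
                n += 1
            best = max(best, n)
        return best
```

Because a second user never gets a cache hit on another user's sensitive prefix, the timing signal that the attack exploits disappears for exactly those entries, while public prefixes keep their reuse benefit.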
Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models
Side-channel attacks on shared hardware resources increasingly threaten confidentiality, especially with the rise of Large Language Models (LLMs). In this work, we introduce Spill The Beans, a novel application of cache side-channels to leak tokens generated by an LLM. By co-locating an attack process on the same hardware as the victim model, we flush and reload embedding vectors from the embedding layer, where each token corresponds to a unique embedding vector. When a vector is accessed during token generation, the access results in a cache hit detectable by our attack on shared lower-level caches. A significant challenge is the massive size of LLMs, whose compute-intensive operation quickly evicts embedding vectors from the cache. We address this by balancing the number of tokens monitored against the amount of information leaked. Monitoring more tokens increases potential vocabulary leakage but raises the chance of missing cache hits due to eviction; monitoring fewer tokens improves detection reliability but limits vocabulary coverage. Through extensive experimentation, we demonstrate the feasibility of leaking tokens from LLMs via cache side-channels. Our findings reveal a new vulnerability in LLM deployments, highlighting that even sophisticated models are susceptible to traditional side-channel attacks. We discuss the implications for privacy and security in LLM-serving infrastructures and suggest considerations for mitigating such threats. As a proof of concept, we consider two concrete attack scenarios. Our experiments show that an attacker can recover as much as 80%-90% of a high-entropy API key with single-shot monitoring, and up to a 40% recovery rate for English text in a single shot. We note that these rates depend heavily on the monitored token set and can be improved by targeting more specialized output domains.
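The decision at the heart of flush+reload is a latency threshold: after flushing a token's embedding cache lines and waiting, a fast reload means the victim touched that token. A toy sketch of that step (the threshold, timings, and function name are illustrative, not measurements from the paper):

```python
# Flush+reload hit/miss decision: reload latency below the threshold
# indicates the line was re-cached, i.e. the victim accessed that
# token's embedding vector during generation.
HIT_THRESHOLD_CYCLES = 100  # assumed hit/miss boundary for this sketch

def leaked_tokens(reload_times):
    """Given token -> reload latency (cycles), return the tokens whose
    embedding vectors the victim appears to have accessed."""
    return {tok for tok, t in reload_times.items() if t < HIT_THRESHOLD_CYCLES}
```

The paper's monitoring trade-off then amounts to choosing how many tokens to probe per generation step: each extra probed token widens vocabulary coverage but lengthens the probe loop, raising the odds of missing a hit to eviction.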
Learning to Localize Leakage of Cryptographic Sensitive Variables
Gammell, Jimmy, Raghunathan, Anand, Hashemi, Abolfazl, Roy, Kaushik
While cryptographic algorithms such as the ubiquitous Advanced Encryption Standard (AES) are secure, *physical implementations* of these algorithms in hardware inevitably 'leak' sensitive data such as cryptographic keys. A particularly insidious form of leakage arises from the fact that hardware consumes power and emits radiation in a manner that is statistically associated with the data it processes and the instructions it executes. Supervised deep learning has emerged as a state-of-the-art tool for carrying out *side-channel attacks*, which exploit this leakage by learning to map power/radiation measurements throughout encryption to the sensitive data operated on during that encryption. In this work we develop a principled deep learning framework for determining the relative leakage due to measurements recorded at different points in time, in order to inform *defense* against such attacks. This information is invaluable to cryptographic hardware designers for understanding *why* their hardware leaks and how they can mitigate it (e.g. by indicating the particular sections of code or electronic components which are responsible). Our framework is based on an adversarial game between a family of classifiers trained to estimate the conditional distributions of sensitive data given subsets of measurements, and a budget-constrained noise distribution which probabilistically erases individual measurements to maximize the loss of these classifiers. We demonstrate our method's efficacy and ability to overcome limitations of prior work through extensive experimental comparison with 8 baseline methods using 3 evaluation metrics and 6 publicly-available power/EM trace datasets from AES, ECC and RSA implementations. We provide an open-source PyTorch implementation of these experiments.
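The adversarial game rests on a simple intuition: under a budget of erasures, the noise player deletes the measurements whose removal most degrades the classifier, and the erased indices localize the leakage. A stylized sketch of that intuition (a greedy stand-in, not the authors' trained noise distribution; inputs and names are assumed):

```python
# Budget-constrained leakage localization: timesteps whose erasure
# raises classifier loss the most are the leakiest.
def localize_leakage(loss_without, baseline_loss, budget):
    """loss_without[i]: classifier loss with timestep i erased.
    Returns the indices of the `budget` leakiest timesteps."""
    gains = {i: l - baseline_loss for i, l in enumerate(loss_without)}
    ranked = sorted(gains, key=gains.get, reverse=True)
    return sorted(ranked[:budget])
```

In the paper this selection is not greedy: the erasure probabilities are learned jointly against a family of classifiers, which is what lets the method handle countermeasures like masking where leakage is spread across combinations of timesteps.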
Unveiling ECC Vulnerabilities: LSTM Networks for Operation Recognition in Side-Channel Attacks
Battistello, Alberto, Bertoni, Guido, Corrias, Michele, Nava, Lorenzo, Rusconi, Davide, Zoia, Matteo, Pierazzi, Fabio, Lanzi, Andrea
We propose a novel approach for performing side-channel attacks on elliptic curve cryptography. Unlike previous approaches, and inspired by the "activity detection" literature, we adopt a long short-term memory (LSTM) neural network to analyze a power trace and identify patterns of operation in the scalar multiplication algorithm performed during an ECDSA signature, which allows us to recover bits of the ephemeral key and thus retrieve the signer's private key. Our approach is based on the fact that modular reductions are conditionally performed by micro-ecc and depend on key bits. We evaluated the feasibility and reproducibility of our attack through experiments in both simulated and real implementations. We demonstrate the effectiveness of our attack by implementing it on a real target device, an STM32F415 with the micro-ecc library, and successfully compromise it. Furthermore, we show that current countermeasures, specifically the coordinate randomization technique, are not sufficient to protect against side channels. Finally, we suggest other approaches that may be implemented to thwart our attack.
Improving Location-based Thermal Emission Side-Channel Analysis Using Iterative Transfer Learning
Lou, Tun-Chieh, Wang, Chung-Che, Jang, Jyh-Shing Roger, Li, Henian, Lin, Lang, Chang, Norman
This paper proposes the use of iterative transfer learning applied to deep learning models for side-channel attacks. Currently, most of the side-channel attack methods train a model for each individual byte, without considering the correlation between bytes. However, since the models' parameters for attacking different bytes may be similar, we can leverage transfer learning, meaning that we first train the model for one of the key bytes, then use the trained model as a pretrained model for the remaining bytes. This technique can be applied iteratively, a process known as iterative transfer learning. Experimental results show that when using thermal or power consumption map images as input, and multilayer perceptron or convolutional neural network as the model, our method improves average performance, especially when the amount of data is insufficient.
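The iterative transfer-learning loop described here is simple: the model trained for key byte i warm-starts training for byte i+1, so only the first byte pays the full training cost. A schematic sketch with a stand-in "trainer" (the trainer, step counts, and data are illustrative assumptions, not the paper's networks):

```python
# Iterative transfer learning over key bytes: warm-start each byte's
# model from the previous byte's trained parameters.
def train(model_params, data, steps):
    """Stand-in trainer: nudge a scalar parameter toward the data mean."""
    p = model_params
    target = sum(data) / len(data)
    for _ in range(steps):
        p = p + 0.5 * (target - p)
    return p

def attack_all_bytes(per_byte_data, steps_first=8, steps_rest=2):
    models = []
    params = 0.0  # cold start only for the first byte
    for i, data in enumerate(per_byte_data):
        steps = steps_first if i == 0 else steps_rest  # later bytes converge faster
        params = train(params, data, steps)
        models.append(params)
    return models
```

The payoff is exactly the paper's claim: because per-byte models are similar, the warm-started bytes reach comparable accuracy with far less data and training, which matters most when traces are scarce.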
A Hard-Label Cryptanalytic Extraction of Non-Fully Connected Deep Neural Networks using Side-Channel Attacks
Coqueret, Benoit, Carbone, Mathieu, Sentieys, Olivier, Zaid, Gabriel
During the past decade, Deep Neural Networks (DNNs) proved their value on a large variety of subjects. However, despite their high value and public accessibility, the protection of the intellectual property of DNNs is still an issue and an emerging research field. Recent works have successfully extracted fully-connected DNNs using cryptanalytic methods in hard-label settings, proving that it was possible to copy a DNN with high fidelity, i.e., high similarity in the output predictions. However, the current cryptanalytic attacks cannot target complex, i.e., not fully connected, DNNs and are limited to special cases of neurons present in deep networks. In this work, we introduce a new end-to-end attack framework designed for model extraction of embedded DNNs with high fidelity. We describe a new black-box side-channel attack which splits the DNN into several linear parts for which we can perform cryptanalytic extraction and retrieve the weights in hard-label settings. With this method, we are able to adapt cryptanalytic extraction, for the first time, to non-fully connected DNNs, while maintaining a high fidelity. We validate our contributions by targeting several architectures implemented on a microcontroller unit, including a Multi-Layer Perceptron (MLP) of 1.7 million parameters and a shortened MobileNetv1. Our framework successfully extracts all of these DNNs with high fidelity (88.4% for the MobileNetv1 and 93.2% for the MLP). Furthermore, we use the stolen model to generate adversarial examples and achieve close to white-box performance on the victim's model (95.8% and 96.7% transfer rate).
Power side-channel leakage localization through adversarial training of deep neural networks
Gammell, Jimmy, Raghunathan, Anand, Roy, Kaushik
Supervised deep learning has emerged as an effective tool for carrying out power side-channel attacks on cryptographic implementations. While increasingly-powerful deep learning-based attacks are regularly published, comparatively-little work has gone into using deep learning to defend against these attacks. In this work we propose a technique for identifying which timesteps in a power trace are responsible for leaking a cryptographic key, through an adversarial game between a deep learning-based side-channel attacker which seeks to classify a sensitive variable from the power traces recorded during encryption, and a trainable noise generator which seeks to thwart this attack by introducing a minimal amount of noise into the power traces. We demonstrate on synthetic datasets that our method can outperform existing techniques in the presence of common countermeasures such as Boolean masking and trace desynchronization. Results on real datasets are weak because the technique is highly sensitive to hyperparameters and early-stop point, and we lack a holdout dataset with ground truth knowledge of leaking points for model selection. Nonetheless, we believe our work represents an important first step towards deep side-channel leakage localization without relying on strong assumptions about the implementation or the nature of its leakage. An open-source PyTorch implementation of our experiments is provided.